Text simplification using synchronous dependency grammars: Generalising automatically harvested rules
Abstract
We present an approach to text simplification based on synchronous dependency grammars. Our main contributions in this work are (a) a study of how automatically derived lexical simplification rules can be generalised to enable their application in new contexts without introducing errors, and (b) an evaluation of our hybrid system that combines a large set of automatically acquired rules with a small set of hand-crafted rules for common syntactic simplification. Our evaluation shows significant improvements over the state of the art, with scores comparable to human simplifications.
Related resources
Hybrid text simplification using synchronous dependency grammars with hand-written and automatically harvested rules
We present an approach to text simplification based on synchronous dependency grammars. The higher level of abstraction afforded by dependency representations allows for a linguistically sound treatment of complex constructs requiring reordering and morphological change, such as conversion of passive voice to active. We present a synchronous grammar formalism in which it is easy to write rules ...
Automatic Learning of Parallel Dependency Treelet Pairs
Induction of synchronous grammars from empirical data has long been a problem unsolved; despite that generative synchronous grammars theoretically suit the machine translation task very well. This fact is mainly due to pervasive structural divergences between languages. This paper presents a statistical approach to learn dependency structure mappings from parallel corpora. The algorithm introdu...
Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text u...
Reestimation and Best-First Parsing Algorithm for Probabilistic Dependency Grammars
This paper presents a reestimation algorithm and a best-first parsing (BFP) algorithm for probabilistic dependency grammars (PDG). The proposed reestimation algorithm is a variation of the inside-outside algorithm adapted to probabilistic dependency grammars. The inside-outside algorithm is a probabilistic parameter reestimation algorithm for phrase structure grammars in Chomsky Normal Form (CN...
Automatic induction of rules for text simplification
Long and complicated sentences pose various problems to many state-of-the-art natural language technologies. We have been exploring methods to automatically transform such sentences as to make them simpler. These methods involve the use of a rule-based system, driven by the syntax of the text in the domain of interest. Hand-crafting rules for every domain is time-consuming and impractical. This...